Keyphrase Extraction using Sequential Labeling
Abstract
Keyphrases efficiently summarize a document’s content and are used in various document processing and retrieval tasks. Several unsupervised techniques and classifiers exist for extracting keyphrases from text documents. Most of these methods operate at the phrase level and rely on part-of-speech (POS) filters to generate candidate phrases. In addition, they do not directly handle keyphrases of varying lengths. In this paper, we overcome these modeling shortcomings by addressing keyphrase extraction as a sequential labeling task. To train our taggers, we explore a basic set of features commonly used in NLP tasks as well as predictions from various unsupervised methods. In addition to providing a more natural model of the keyphrase extraction problem, we show that tagging models yield significant performance benefits over existing state-of-the-art extraction methods.
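To make the sequential-labeling framing concrete, the following minimal sketch shows how gold keyphrases could be turned into per-token labels and how phrases of any length could be read back from a label sequence. The B/I/O label set and the function names `bio_labels` and `decode_spans` are assumptions made for illustration; the abstract does not state which label scheme or tagger the paper uses.

```python
from typing import List, Sequence, Tuple


def bio_labels(tokens: Sequence[str], keyphrases: Sequence[Sequence[str]]) -> List[str]:
    """Tag each token B (begins a keyphrase), I (inside one), or O (outside).

    Assumption: a B/I/O scheme with case-insensitive exact matching.
    """
    lowered = [t.lower() for t in tokens]
    labels = ["O"] * len(tokens)
    # Match longer phrases first so a short phrase cannot claim part of a longer one.
    for phrase in sorted(keyphrases, key=len, reverse=True):
        target = [w.lower() for w in phrase]
        n = len(target)
        if n == 0:
            continue
        for start in range(len(tokens) - n + 1):
            span_free = all(label == "O" for label in labels[start:start + n])
            if span_free and lowered[start:start + n] == target:
                labels[start] = "B"
                for i in range(start + 1, start + n):
                    labels[i] = "I"
    return labels


def decode_spans(tokens: Sequence[str], labels: Sequence[str]) -> List[Tuple[str, ...]]:
    """Recover keyphrases of any length from a label sequence."""
    spans: List[Tuple[str, ...]] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B":
            # A new phrase starts; close any phrase that was still open.
            if current:
                spans.append(tuple(current))
            current = [token]
        elif label == "I" and current:
            current.append(token)
        else:
            # "O" (or a stray "I" with no open phrase) ends the current phrase.
            if current:
                spans.append(tuple(current))
            current = []
    if current:
        spans.append(tuple(current))
    return spans
```

Because each token receives its own label, one-word and multi-word keyphrases are handled by the same decoding step, with no separate candidate-generation filter. A trained tagger would predict the label sequence from token features; here the labels are derived from known keyphrases only to show the encoding.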
Similar resources
WING-NUS at SemEval-2017 Task 10: Keyphrase Identification and Classification as Joint Sequence Labeling
We describe an end-to-end pipeline processing approach for SemEval 2017’s Task 10 to extract keyphrases and their relations from scientific publications. We jointly identify and classify keyphrases by modeling the subtasks as sequential labeling. Our system utilizes standard, surface-level features along with the adjacent word features, and performs conditional decoding on whole text to extract...
Incorporating Expert Knowledge into Keyphrase Extraction
Keyphrases that efficiently summarize a document’s content are used in various document processing and retrieval tasks. Current state-of-the-art techniques for keyphrase extraction operate at a phrase-level and involve scoring candidate phrases based on features of their component words. In this paper, we learn keyphrase taggers for research papers using token-based features incorporating lingu...
Improved Keyword and Keyphrase Extraction from Meeting Transcripts
Keywords play a vital role in extracting the correct information as per user requirements. Keywords are like index terms that contain the most important information about the content of the document. Keyword extraction is the task of identifying a keyword or keyphrase from a document that can help users easily understand the document. Meeting transcripts are significantly different from docu...
Coherent Keyphrase Extraction via Web Mining
Keyphrases are useful for a variety of purposes, including summarizing, indexing, labeling, categorizing, clustering, highlighting, browsing, and searching. The task of automatic keyphrase extraction is to select keyphrases from within the text of a given document. Automatic keyphrase extraction makes it feasible to generate keyphrases for the huge number of documents that do not have manually ...
CorePhrase: Keyphrase Extraction for Document Clustering
The ability to discover the topic of a large set of text documents using relevant keyphrases is usually regarded as a very tedious task if done by hand. Automatic keyphrase extraction from multi-document data sets or text clusters provides a very compact summary of the contents of the clusters, which often helps in locating information easily. We introduce an algorithm for topic discovery using...
Journal title:
- CoRR
Volume: abs/1608.00329
Pages: -
Publication date: 2016